1.   Introduction

Recently, there has been considerable interest in representations of numbers other than the conventional positional notation for digital hardware calculations [1]; the concern here is with continued fractions. To facilitate the hardware implementation, we require that the coefficients of the continued fractions be integral powers of two. One important requirement for such a representation to be useful is that it should be possible to define a broad class of algorithms that are easily soluble. It was shown that a limited class of quadratics can be solved using this approach [1,2]. This was later extended to polynomials of degree larger than two [3]. An algorithm for the logarithm was presented in [4]. The class of Riccati differential equations is closed under a bilinear transformation [5]. In this paper we show that a very large number of functions may be evaluated using the Riccati equation approach.

As a result of the restriction on the coefficients of the continued fractions, the selection of the coefficients during the iterative evaluation of a function becomes a difficult problem. We require that such a selection procedure be computationally "simple." It was shown that a simple selection procedure can be obtained for the algorithm for the quadratic equation [1,2]. This was later extended to polynomials of degree larger than two [3]. Recently, we have shown that for an algorithm for the logarithm, a simple selection procedure does not exist [4]. In this paper, we obtain similar negative results for many functions that can be evaluated using the Riccati differential equation.


An infinite continued fraction is represented by

$$\cfrac{p_1}{q_1 + \cfrac{p_2}{q_2 + \cdots}}$$

where the $p_i$ are known as the partial numerators and the $q_i$ are known as the partial denominators. The classical theory of continued fractions uses $p_i = 1$ and $q_i \in N$, where $N$ is the set of natural numbers. We differ from this in that we require $p_i \in S_p$ and $q_i \in S_q$ such that $S_p$ and $S_q$ are finite and positive sets. If we let $p_{\min} = \min S_p$, $p_{\max} = \max S_p$, $q_{\min} = \min S_q$ and $q_{\max} = \max S_q$, then the smallest number, $m$, representable as an infinite continued fraction is the positive solution of the quadratic

$$m = \cfrac{p_{\min}}{q_{\max} + \cfrac{p_{\max}}{q_{\min} + m}}.$$

Similarly, the largest representable number, $M$, is the positive solution of the quadratic

$$M = \cfrac{p_{\max}}{q_{\min} + \cfrac{p_{\min}}{q_{\max} + M}}.$$

Let $m_{pq} = \dfrac{p}{q+M}$, $M_{pq} = \dfrac{p}{q+m}$ and $I_{pq} = [m_{pq}, M_{pq}]$, where $p \in S_p$ and $q \in S_q$. Note that $I_{pq}$ is a closed interval of the real numbers. It can be shown [4] that the set of numbers representable as infinite continued fractions, using finite and positive digit sets $S_p$ and $S_q$, is complete iff

$$I_{S_p S_q} \triangleq \bigcup_{p \in S_p,\ q \in S_q} I_{pq} = [m, M].$$


It can also be shown that if $S_p = \{1\}$ and $S_q \subseteq N$ then we have completeness only if $S_q = N$. But this conflicts with the requirement of finiteness. Therefore, we will depart from the classical approach either by allowing fractions in $S_q$ or by using a larger set of partial numerators or both.
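The completeness condition can be checked numerically for small digit sets. The following is a minimal sketch, not part of the original development: the function names `bounds` and `is_complete` and the tolerance `tol` are assumptions made for illustration. It obtains $m$ by eliminating $M$ from the pair $m = p_{\min}/(q_{\max}+M)$, $M = p_{\max}/(q_{\min}+m)$, which gives $q_{\max} m^2 + (q_{\max} q_{\min} + p_{\max} - p_{\min})\,m - p_{\min} q_{\min} = 0$, and then tests whether the intervals $I_{pq}$ cover $[m, M]$ without gaps.

```python
import math

def bounds(S_p, S_q):
    """Return (m, M), the smallest and largest representable numbers."""
    p_min, p_max = min(S_p), max(S_p)
    q_min, q_max = min(S_q), max(S_q)
    # Quadratic in m obtained by eliminating M (see the lead-in).
    a = q_max
    b = q_max * q_min + p_max - p_min
    c = -p_min * q_min
    m = (-b + math.sqrt(b * b - 4 * a * c)) / (2 * a)
    M = p_max / (q_min + m)
    return m, M

def is_complete(S_p, S_q, tol=1e-12):
    """True if the union of the intervals I_pq equals [m, M] (up to tol)."""
    m, M = bounds(S_p, S_q)
    intervals = sorted((p / (q + M), p / (q + m)) for p in S_p for q in S_q)
    reach = m
    for lo, hi in intervals:
        if lo > reach + tol:      # a gap: some numbers are not representable
            return False
        reach = max(reach, hi)
    return reach >= M - tol
```

With the illustrative digit sets $S_p = S_q = \{1, 2\}$ the intervals overlap and cover $[m, M]$, whereas $S_p = \{1\}$, $S_q = \{1, 2\}$ leaves a gap, in line with the remark that a finite $S_q \subseteq N$ cannot give completeness when $S_p = \{1\}$.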

Let the finite continued fraction

$$\cfrac{p_1}{q_1 + \cfrac{p_2}{q_2 + \cdots + \cfrac{p_n}{q_n}}}$$

be denoted by $P_n/Q_n$. Letting $P_0 = 0$, $Q_0 = 1$, $P_{-1} = 1$ and $Q_{-1} = 0$, we can evaluate such a fraction using the following recursions [6]:

$$P_{i+1} = p_{i+1} P_{i-1} + q_{i+1} P_i,$$

$$Q_{i+1} = p_{i+1} Q_{i-1} + q_{i+1} Q_i, \qquad i = 0, 1, \ldots, n-1.$$


Each iterative step of such an evaluation requires four multiplications and two additions. If we require that the $p_i$'s and $q_i$'s are powers of two then these four multiplications can be reduced to simple shifts in binary arithmetic. We will, therefore, require that such be the case.
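The reduction of the four multiplications to shifts can be illustrated with integer arithmetic. This is a minimal sketch under the assumption that each $p_i = 2^{a_i}$ and $q_i = 2^{b_i}$ with non-negative integer exponents, so that plain left shifts suffice; the names `convergent` and `direct` are hypothetical.

```python
from fractions import Fraction

def convergent(exponents):
    """Evaluate P_n/Q_n by the recursions, using shifts instead of multiplications.

    `exponents` is a list of (a, b) meaning p = 2**a and q = 2**b, with a, b >= 0.
    """
    P_prev, Q_prev = 1, 0   # P_{-1}, Q_{-1}
    P, Q = 0, 1             # P_0, Q_0
    for a, b in exponents:
        # P_{i+1} = p_{i+1} P_{i-1} + q_{i+1} P_i, and likewise for Q
        P, P_prev = (P_prev << a) + (P << b), P
        Q, Q_prev = (Q_prev << a) + (Q << b), Q
    return Fraction(P, Q)

def direct(exponents):
    """Evaluate the same finite continued fraction from the innermost term outward."""
    value = Fraction(0)
    for a, b in reversed(exponents):
        value = Fraction(2 ** a) / (2 ** b + value)
    return value
```

For example, the exponents `[(1, 0), (0, 1), (1, 1)]` encode $p = 2, 1, 2$ and $q = 1, 2, 2$, and both routines return $3/2$.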

In the classical approach to function evaluation, a finite continued fraction with a few terms is used. Furthermore, the partial numerators and partial denominators are generally positive integral powers of the argument, $x$ [7]. This will clearly require multiplications in an iterative step. Our approach requires that the partial numerators and denominators be simple powers of two. This implies that the complexity of function evaluation is transferred to a selection procedure which yields the value of the pair $(p_i, q_i)$ at the $i$-th iterative step. Since such a selection procedure, in general, will be very complex (of the order of complexity of the function to be evaluated) and since it will be used in each iterative step, we are forced to use some approximation so as to render it "simple." A "simple" selection procedure may use shift, add, subtract and comparison operations only. This leads us to a discussion of redundancy [2,4].

Given $S_p$ and $S_q$, if we have completeness then the set of numbers representable as infinite continued fractions will be called a number system (NS). A number system is defined to be nonredundant if for all distinct pairs $(p_1, q_1)$, $(p_2, q_2)$ with $p_1, p_2 \in S_p$ and $q_1, q_2 \in S_q$, $I_{p_1 q_1} \cap I_{p_2 q_2}$ is either null or is a singleton. A number system is redundant if it is not nonredundant. It can be easily shown that for a nonredundant number system, all but a countable set of numbers can be represented uniquely. Therefore, the use of any approximation in the selection procedure implies that we use a redundant number system.

Two approaches to function evaluation using continued fractions have been attempted. In the first approach, the function to be evaluated is $f(\underline{a}_0)$, where $\underline{a}$ is a vector of arguments, and we expand $f(\underline{a}_i)$ using the following bilinear transformation:

$$f(\underline{a}_i) = \frac{p_{i+1}}{q_{i+1} + f(\underline{a}_{i+1})}.$$

We require that the vector of coefficients $\underline{a}_{i+1}$ can be obtained from $\underline{a}_i$, $p_{i+1}$ and $q_{i+1}$ by means of "simple" recursions. A recursion is "simple" if it uses shift, addition and subtraction operations only. The algorithm for the solution of a quadratic equation [1] and the algorithm for the logarithm [4] are members of this class.

In  the  second  approach,  we  look  for  equations  (algebraic  or 
differential)  which  are  closed  under  a  bilinear  transformation.  All  the 


functions  which  are  solutions  to  such  equations  can  then  be  evaluated. 
The  Riccati  differential  equation  is  a  member  of  this  class. 

In  Section  2,  we  show  that  a  very  large  number  of  functions  can 
be  evaluated  using  the  Riccati  equation  approach.   In  Section  3,  we  show 
that  no  simple  selection  procedure  exists  for  the  functions  discussed  in 
Section  2. 


2.   Riccati  Equation 

The Riccati equation can be written as:

$$y' + a(x)\,y^2 + b(x)\,y + c(x) = 0. \tag{2.1}$$

Let $L$ be the set of all Riccati equations of this form. Wynn has shown that the set $L$ is closed under the bilinear transformation $y = p/(q+z)$ where $p$, $q$ are constants [5]. Starting with $\ell_0 \in L$, by a repeated application of the bilinear transformation, we can obtain a continued fraction expansion for the solution to the initial Riccati equation $\ell_0$.

Let $\ell_0$ be given by $y_0' = a_0 y_0^2 + b_0 y_0 + c_0$, and let $y_0 = p_1/(q_1 + y_1)$. Let this transformation be called $T_1 : L \to L$. Then $T_1(\ell_0) = \ell_1$ is given by $y_1' + a_1 y_1^2 + b_1 y_1 + c_1 = 0$. The recursion relations for the coefficients $a_1$, $b_1$, $c_1$ in terms of $a_0$, $b_0$, $c_0$ are:

$$a_1 = c_0/p_1, \qquad b_1 = b_0 + 2 c_0 q_1/p_1, \qquad c_1 = a_0 p_1 + b_0 q_1 + c_0 q_1^2/p_1. \tag{2.2}$$


Note here that we have changed the form of $\ell_0$ to avoid negative signs in recursions (2.2). In general, let $\ell_{2m}$ be $y_{2m}' = a_{2m} y_{2m}^2 + b_{2m} y_{2m} + c_{2m}$ and let $\ell_{2m+1}$ be $y_{2m+1}' + a_{2m+1} y_{2m+1}^2 + b_{2m+1} y_{2m+1} + c_{2m+1} = 0$. Assume that $\ell_n = T_n T_{n-1} \cdots T_1(\ell_0)$ has been obtained. Then the coefficients of $\ell_{n+1} = T_{n+1}(\ell_n)$ are given by the following recursions:

$$a_{n+1} = c_n/p_{n+1}, \qquad b_{n+1} = b_n + 2 c_n q_{n+1}/p_{n+1}, \qquad c_{n+1} = a_n p_{n+1} + b_n q_{n+1} + c_n q_{n+1}^2/p_{n+1}. \tag{2.3}$$
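The coefficient recursion can be sanity-checked numerically on one step. The sketch below is only an illustration: it takes $\ell_0$ to be $y_0' = y_0^2 + 1$ (solved by $\tan x$), picks the arbitrary values $p_1 = 2$, $q_1 = 1$, and verifies by a central difference that $y_1 = p_1/y_0 - q_1$ satisfies $y_1' + a_1 y_1^2 + b_1 y_1 + c_1 = 0$. The names `step` and `residual` and the step size `h` are assumptions.

```python
import math

def step(a, b, c, p, q):
    """One application of the coefficient recursion for the pair (p, q)."""
    return c / p, b + 2 * c * q / p, a * p + b * q + c * q * q / p

def residual(x, p=2.0, q=1.0, h=1e-5):
    """Left-hand side of the transformed equation evaluated at x (should be ~0)."""
    a1, b1, c1 = step(1.0, 0.0, 1.0, p, q)      # l_0: y' = y^2 + 1, i.e. y_0 = tan x

    def y1(t):
        return p / math.tan(t) - q              # from y_0 = p / (q + y_1)

    dy1 = (y1(x + h) - y1(x - h)) / (2 * h)     # central-difference derivative
    return dy1 + a1 * y1(x) ** 2 + b1 * y1(x) + c1
```

For these values the recursion gives $a_1 = 1/2$, $b_1 = 1$, $c_1 = 5/2$, and the residual vanishes up to discretization error.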

As a result of these transformations, we have expanded $y_0$ to $n+1$ terms as follows:

$$y_0 = \cfrac{p_1}{q_1 + \cfrac{p_2}{q_2 + \cdots + \cfrac{p_{n+1}}{q_{n+1} + y_{n+1}}}} \tag{2.4}$$

Let $P_{n+1}/Q_{n+1}$ denote the finite continued fraction obtained by setting $y_{n+1} = 0$ in equation (2.4). If we assume that $|y_{n+1}| < M$ where $M$ is a fixed constant then clearly, the fraction $P_n/Q_n$ converges to $y_0$. By setting $P_0 = 0$, $Q_0 = 1$, $P_1 = p_1$ and $Q_1 = q_1$, the recursions for $P_{n+1}$ and $Q_{n+1}$ are [4]:

$$P_{n+1} = q_{n+1} P_n + p_{n+1} P_{n-1}, \qquad Q_{n+1} = q_{n+1} Q_n + p_{n+1} Q_{n-1}. \tag{2.5}$$

Thus if we have a method to correctly choose $p_n$, $q_n$ for every $n$, then we have an algorithm to solve the Riccati equation.


2.1  Power  of  the  Method 

We  will  now  discuss  the  number  of  functions  that  can  be  obtained  by 
the  method  of  Riccati  equation. 

2.1.1  Constant  Coefficients 

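Let us consider the subset of $L$ consisting of all Riccati equations with constant coefficients, i.e., $\{\, y' + a y^2 + b y + c = 0 \mid a, b, c \in R \,\}$. Consider $\ell_0$ in this subset given by $y_0' = a_0 y_0^2 + b_0 y_0 + c_0$. Depending on the sign of $\Delta = b_0^2 - 4 a_0 c_0$, the solution $y_0(x)$ of $\ell_0$ can be written as

$$y_0(x) = \frac{\sqrt{-\Delta}}{2a_0}\left(\tan\left(\frac{\sqrt{-\Delta}}{2}\,x + A_0\right) - \frac{b_0}{\sqrt{-\Delta}}\right) \quad \text{if } \Delta < 0,\ a_0 \ne 0;$$

$$y_0(x) = -\frac{1}{a_0 x + A_0} - \frac{b_0}{2a_0} \quad \text{if } \Delta = 0,\ a_0 \ne 0;$$

$$y_0(x) = \frac{\sqrt{\Delta}}{2a_0}\left(\tanh\left(-\frac{\sqrt{\Delta}}{2}\,x + A_0\right) - \frac{b_0}{\sqrt{\Delta}}\right) \quad \text{if } \Delta > 0,\ a_0 \ne 0;$$

$$y_0(x) = A_0 e^{b_0 x} - \frac{c_0}{b_0} \quad \text{if } a_0 = 0,\ b_0 \ne 0.$$

Depending on the values of the coefficients $a_0$, $b_0$, $c_0$ and the initial condition $t_0 = y_0(0)$, many different functions may be evaluated as shown in the following table.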
  a_0     b_0     c_0     Δ      t_0     y_0(x)
  ---------------------------------------------
   1       0       1      -4      0      tan x
  -1       0      -1      -4      ∞      cot x
  -1       0       0       0      ∞      1/x
  -1       0       1       4      ∞      coth x
  -1       0       1       4      0      tanh x
   0      ±1       0       1      1      e^{±x}

Table 2.1
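The rows of the table above can be checked against the equation $y_0' = a_0 y_0^2 + b_0 y_0 + c_0$ by numerical differentiation. This is a small sketch for verification only; the list `ROWS`, the function name `max_residual`, the sample point and the step size are assumptions.

```python
import math

# (a0, b0, c0, y0) for each tabulated function; e^{+x} and e^{-x} are listed separately.
ROWS = [
    (1, 0, 1, math.tan),
    (-1, 0, -1, lambda x: 1 / math.tan(x)),
    (-1, 0, 0, lambda x: 1 / x),
    (-1, 0, 1, lambda x: 1 / math.tanh(x)),
    (-1, 0, 1, math.tanh),
    (0, 1, 0, math.exp),
    (0, -1, 0, lambda x: math.exp(-x)),
]

def max_residual(x=0.4, h=1e-5):
    """Largest |y' - (a y^2 + b y + c)| over all rows, with y' by central difference."""
    worst = 0.0
    for a, b, c, y in ROWS:
        dy = (y(x + h) - y(x - h)) / (2 * h)
        worst = max(worst, abs(dy - (a * y(x) ** 2 + b * y(x) + c)))
    return worst
```

A residual near zero for every row confirms that the tabulated coefficients and functions are consistent with each other.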

2.1.2  Variable  Coefficients 


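Consider the subset of $L$ given by

$$\{\, y' = a(x)\,y^2 + b(x)\,y + c(x) \mid a(x) = k(x)\,a,\ b(x) = k(x)\,b,\ c(x) = k(x)\,c, \text{ and } a, b, c \text{ are constants} \,\}.$$

Recursions for $a_{i+1}$, $b_{i+1}$ and $c_{i+1}$ can be derived from the recursions (2.3) and are as follows:

$$a_{i+1} = c_i/p_{i+1}, \qquad b_{i+1} = b_i + 2 c_i q_{i+1}/p_{i+1}, \qquad c_{i+1} = a_i p_{i+1} + b_i q_{i+1} + c_i q_{i+1}^2/p_{i+1}. \tag{2.6}$$

Depending on the sign of $\Delta_0 = b_0^2 - 4 a_0 c_0$, the solution to $\ell_0$ is given by:

$$y_0(x) = \frac{\sqrt{-\Delta_0}}{2a_0}\left(\tan\left(\frac{\sqrt{-\Delta_0}}{2}\int k(x)\,dx + A_0\right) - \frac{b_0}{\sqrt{-\Delta_0}}\right) \quad \text{if } \Delta_0 < 0,\ a_0 \ne 0;$$

$$y_0(x) = -\frac{1}{a_0 \int k(x)\,dx + A_0} - \frac{b_0}{2a_0} \quad \text{if } \Delta_0 = 0,\ a_0 \ne 0;$$

$$y_0(x) = \frac{\sqrt{\Delta_0}}{2a_0}\left(\tanh\left(-\frac{\sqrt{\Delta_0}}{2}\int k(x)\,dx + A_0\right) - \frac{b_0}{\sqrt{\Delta_0}}\right) \quad \text{if } \Delta_0 > 0,\ a_0 \ne 0;$$

$$y_0(x) = A_0\, e^{\,b_0 \int k(x)\,dx} - \frac{c_0}{b_0} \quad \text{if } a_0 = 0,\ b_0 \ne 0;$$

and

$$y_0(x) = c_0 \int k(x)\,dx + A_0 \quad \text{if } a_0 = b_0 = 0.$$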

Clearly,  a  large  class  of  functions  can  be  evaluated  with  this  method. 

2.1.2.1  The Case With $\Delta = 0$

In this section, we will concentrate on the subset of the equations of Section 2.1.2 for which $\Delta = 0$. Any $\ell$ in this subset can be rewritten as $y' = k(x)(a^* y + b^*)^2$, where $a^* = \sqrt{a}$ and $b^* = a^*\,(b/2a)$. With this modification, we have reduced the number of coefficients from three to two. The recursions on $a_n^*$, $b_n^*$ can now be written as follows:

$$a_{n+1}^* = b_n^*/\sqrt{p_{n+1}}, \qquad b_{n+1}^* = (a_n^*\, p_{n+1} + b_n^*\, q_{n+1})/\sqrt{p_{n+1}}. \tag{2.7}$$

The solution of $\ell_0$ in this subset is given by

$$y_0(x) = \frac{1}{(a_0^*)^2\left(A_0 - \int k(x)\,dx\right)} - \frac{b_0^*}{a_0^*} \quad \text{if } a_0^* \ne 0;$$

$$y_0(x) = (b_0^*)^2 \int k(x)\,dx + A_0 \quad \text{if } a_0^* = 0.$$

Note that we can integrate the given function $k(x)$ by this method by setting $a_0^* = 0$ and $b_0^* = 1$.


2.2  Implementation  Considerations 

Let  us  assume  that  simple  selection  procedures  are  available  for 
all  the  functions  to  be  evaluated  as  detailed  in  section  2.1.  We  now  give 
steps  of  an  algorithm  T  which  will  evaluate  these  functions. 
Algorithm T:
Step 1:   [Initialize]
          Set P_0 <- 0, Q_0 <- 1, P_{-1} <- 1, Q_{-1} <- 0;
          Set initial values of coefficients according
          to the function to be evaluated;
          Set i <- 0;
Step 2:   [Select]
          (p_{i+1}, q_{i+1}) <- Select(x, coefficients, function);
Step 3:   [Recursions]
          P_{i+1} <- q_{i+1} P_i + p_{i+1} P_{i-1};
          Q_{i+1} <- q_{i+1} Q_i + p_{i+1} Q_{i-1};
          Recurse using equations (2.3), (2.6) or
          (2.7), whichever is applicable.
Step 4:   [Test]
          After a 'sufficient' number of iterations
          GO TO Step 5; otherwise set i <- i+1,
          and GO TO Step 2;
Step 5:   [Evaluate]
          y_0(x) = f(x) = P_{i+1}/Q_{i+1};
END T;
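The control flow of Algorithm T can be sketched in Python under a strong assumption: the procedure Select is replaced by an oracle that is handed the exact value of the tail $y_i(x)$, which a real implementation would not have. The digit sets $S_p = S_q = \{1, 2\}$, the names `oracle_select` and `algorithm_t`, and the fixed iteration count are all hypothetical choices made for illustration, and the coefficient recursions of Step 3 are omitted because the oracle does not use them.

```python
from fractions import Fraction
import math

S_P = (1, 2)                      # hypothetical digit sets, chosen only for this sketch
S_Q = (1, 2)
M_LO = (math.sqrt(17) - 3) / 4    # smallest representable number m for these sets
M_HI = (math.sqrt(17) - 1) / 2    # largest representable number M for these sets

def oracle_select(tail):
    """Pick the (p, q) whose interval I_pq contains `tail` with the widest margin."""
    t = float(tail)
    margin, p, q = max(
        (min(t - p / (q + M_HI), p / (q + M_LO) - t), p, q)
        for p in S_P for q in S_Q
    )
    if margin < 0:
        raise ValueError("value lies outside [m, M]")
    return p, q

def algorithm_t(target, steps=60):
    P_prev, Q_prev, P, Q = 1, 0, 0, 1        # Step 1: P_{-1}, Q_{-1}, P_0, Q_0
    tail = Fraction(target)                  # exact y_i(x); known only to the oracle
    for _ in range(steps):                   # Step 4: fixed number of iterations
        p, q = oracle_select(tail)           # Step 2
        P, P_prev = q * P + p * P_prev, P    # Step 3
        Q, Q_prev = q * Q + p * Q_prev, Q
        tail = p / tail - q                  # y_{i+1} = p_{i+1} / y_i - q_{i+1}
    return Fraction(P, Q)                    # Step 5
```

For a target such as $\tan(0.5)$, which lies in $[m, M]$ for these digit sets, the convergents approach the target; the difficulty addressed in Section 3 is that no computationally simple replacement for the oracle is available.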

In any such iterative algorithm, the number of iterations required and the execution time required per iteration are two important considerations. In each iteration, steps 2, 3 and 4 are executed. Clearly, steps 2 and 3 require more attention. We can assume that if the procedure Select is known, it can be implemented in a combinational network and therefore will require very little time. In step 3, we see that all the assignments are independent of each other and therefore can be executed in parallel. Thus, given sufficient hardware, step 3 can be speeded up considerably. Each individual recursion requires additions (subtractions), multiplications and sometimes division also. Since multiplication and division are relatively slower operations, we would like to avoid them if possible. If we restrict the coefficients $p_i$ and $q_i$ to be integral powers of two, these multiplications and divisions will be reduced to shifts, which are relatively faster operations. If we use the recursions (2.7), then we further require that $p_i = 1$ for all $i$. We will assume that $p_i \in S_p$, where $S_p = \{2^j \mid j \text{ is an integer}\}$, and $q_i \in S_q$, where $S_q = \{2^j \mid j \text{ is an integer}\}$. Since the number of shifts available is finite we further require that $S_p = \{2^j \mid \underline{J}_p \le j \le \bar{J}_p\}$ and $S_q = \{2^j \mid \underline{J}_q \le j \le \bar{J}_q\}$, where $\underline{J}_p$, $\bar{J}_p$, $\underline{J}_q$ and $\bar{J}_q$ are fixed integers.

The  number  of  iterations  to  be  carried  out  can  be  decided  on  the 
basis  of  allowable  error  in  the  result. 


2.3  Initial  Condition 

Associated  with  the  solution  of  any  differential  equation,  there 
are  one  or  more  arbitrary  constants  which  are  evaluated  using  the  boundary 
conditions imposed. Depending on the function $f(x)$ to be evaluated, we choose a particular $\ell_0 \in L$ (and the corresponding coefficient values) and the associated initial condition $y_0(0) = t_0$ so that $y_0(x) = f(x)$. Clearly, the initial condition on $\ell_i$ ($i \ge 1$) is dependent on $t_0$. In particular,

$$y_{n-1} = p_n/(q_n + y_n), \text{ which implies}$$

$$t_{n-1} = p_n/(q_n + t_n), \text{ which implies}$$

$$t_n = p_n/t_{n-1} - q_n. \tag{2.8}$$

As we will see in Section 3, $t_n$ is needed as an argument in a selection procedure for $p_{n+1}$ and $q_{n+1}$. Therefore, we need to evaluate $t_n$ in every iteration. This, however, implies that a division be carried out. We can avoid the division by the following technique.

Let $t_n = d_n/e_n$; then, from equation (2.8),

$$\frac{d_n}{e_n} = \frac{p_n e_{n-1}}{d_{n-1}} - q_n.$$

From which,

$$d_n = p_n e_{n-1} - q_n d_{n-1}, \qquad e_n = d_{n-1}, \tag{2.9}$$

and $d_0 = t_0$ and $e_0 = 1$.

If the selection procedure can choose with the help of $d_n$ and $e_n$ (does not explicitly require $t_n$) then we have solved our problem. Now in step 3 of algorithm T, we have to carry out recursions (2.9) as well.
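The division-free recursion can be compared with the direct recursion for $t_n$ in exact arithmetic. This is a minimal sketch; the function names and the sample $(p, q)$ pairs are assumptions chosen for illustration.

```python
from fractions import Fraction

def tails_direct(t0, pairs):
    """t_n = p_n / t_{n-1} - q_n, computed with one division per step."""
    t, out = Fraction(t0), []
    for p, q in pairs:
        t = p / t - q
        out.append(t)
    return out

def tails_division_free(t0, pairs):
    """(d_n, e_n) with d_n = p_n e_{n-1} - q_n d_{n-1}, e_n = d_{n-1}; d_0 = t0, e_0 = 1."""
    d, e, out = t0, 1, []
    for p, q in pairs:
        d, e = p * e - q * d, d
        out.append((d, e))
    return out
```

For $t_0 = 3$ and the pairs $(2,1), (1,2), (2,2)$, the direct recursion yields $-1/3$, $-5$, $-12/5$, and the pairs $(d_n, e_n)$ are $(-1, 3)$, $(5, -1)$, $(-12, 5)$, whose ratios agree.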

3.   Selection  Procedures 

We have seen that the form of the solution to a Riccati equation depends on the sign of the discriminant $\Delta$. It is also clear that the selection procedure will be different for different forms of the solution, i.e., depending on the sign of $\Delta$. Therefore, if $\Delta$ remains invariant under the bilinear transformation then hopefully the same selection procedure can be used consistently during the iterative evaluation of a function. It can be easily seen that this is indeed the case, i.e.,

$$\Delta_i = \Delta_{i-1} = \cdots = \Delta_0.$$
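The invariance of the discriminant follows by expanding $b_{n+1}^2 - 4 a_{n+1} c_{n+1}$ with recursions (2.3); the cross terms cancel. A quick exact-arithmetic check, with hypothetical function names and arbitrary sample pairs, is sketched below.

```python
from fractions import Fraction

def step(a, b, c, p, q):
    """Coefficient recursion (2.3) for one (p, q) pair."""
    return c / p, b + 2 * c * q / p, a * p + b * q + c * q * q / p

def discriminants(a, b, c, pairs):
    """Discriminant b^2 - 4ac before and after each transformation."""
    out = [b * b - 4 * a * c]
    for p, q in pairs:
        a, b, c = step(a, b, c, p, q)
        out.append(b * b - 4 * a * c)
    return out
```

Starting from $a_0 = 1$, $b_0 = 0$, $c_0 = 1$ (so $\Delta_0 = -4$), every entry of the returned list equals $-4$ regardless of the pairs chosen.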

In  Section  3.1,  we  consider  selection  procedures  for  Riccati 
equations  with  constant  coefficients,  and  in  Section  3.2,  we  consider 
the  more  general  case  of  variable  coefficients. 

3.1  Constant  Coefficients 

We will consider two subcases separately depending upon the value of the discriminant $\Delta$.


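3.1.1  The Case With $\Delta < 0$

Consider $\ell_i$ such that $y_i' = j\,(a_i y_i^2 + b_i y_i + c_i)$ where $a_i \ne 0$ and $j = 1$ if $i$ is even and $-1$ otherwise. The solution to this equation is given by

$$y_i(x) = j\,\frac{\sqrt{-\Delta}}{2a_i}\left[\tan\left(\frac{\sqrt{-\Delta}}{2}\,x + A_i\right) - j\,\frac{b_i}{\sqrt{-\Delta}}\right]. \tag{3.1}$$

If we let the initial condition be $y_i(0) = d_i/e_i$ then we can evaluate the arbitrary constant $A_i$ by substituting the initial condition in equation (3.1). Thus,

$$\frac{d_i}{e_i} = j\,\frac{\sqrt{-\Delta}}{2a_i}\left(\tan(A_i) - j\,\frac{b_i}{\sqrt{-\Delta}}\right), \text{ from which}$$

$$A_i = j \arctan\left(\frac{2a_i d_i + b_i e_i}{e_i \sqrt{-\Delta}}\right).$$

Substituting in (3.1), we get

$$y_i(x) = j\,\frac{\sqrt{-\Delta}}{2a_i}\left[\frac{\tan\left(\frac{\sqrt{-\Delta}}{2}x\right) + j\,\dfrac{2a_i d_i + b_i e_i}{e_i\sqrt{-\Delta}}}{1 - j\tan\left(\frac{\sqrt{-\Delta}}{2}x\right)\dfrac{2a_i d_i + b_i e_i}{e_i\sqrt{-\Delta}}} - j\,\frac{b_i}{\sqrt{-\Delta}}\right]$$

$$= \frac{j\sqrt{-\Delta}\,\tan\left(\frac{\sqrt{-\Delta}}{2}x\right) e_i\sqrt{-\Delta} + \sqrt{-\Delta}\,(2a_i d_i + b_i e_i) - b_i\left(e_i\sqrt{-\Delta} - j\tan\left(\frac{\sqrt{-\Delta}}{2}x\right)(2a_i d_i + b_i e_i)\right)}{2a_i\left(e_i\sqrt{-\Delta} - j\tan\left(\frac{\sqrt{-\Delta}}{2}x\right)(2a_i d_i + b_i e_i)\right)}$$

$$= \frac{j\tan\left(\frac{\sqrt{-\Delta}}{2}x\right)\left[-e_i\Delta + b_i(2a_i d_i + b_i e_i)\right] + \sqrt{-\Delta}\,(2a_i d_i)}{2a_i\left[e_i\sqrt{-\Delta} - j\tan\left(\frac{\sqrt{-\Delta}}{2}x\right)(2a_i d_i + b_i e_i)\right]}$$

$$= \frac{j\, r_i\, u + \sqrt{-\Delta}\; d_i}{\sqrt{-\Delta}\; e_i - j\, h_i\, u} \tag{3.2}$$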


where $r_i = 2c_i e_i + b_i d_i$, $h_i = 2a_i d_i + b_i e_i$ and $u = \tan\left(\frac{\sqrt{-\Delta}}{2}x\right)$. It is clear that the process of selection will involve $r_i$, $h_i$, $d_i$ and $e_i$ but not $a_i$, $b_i$ and $c_i$. Therefore, if we could obtain recursions for $r_i$ and $h_i$ which are free of $a_i$, $b_i$ and $c_i$ then we will avoid the computation of $a_i$, $b_i$ and $c_i$. We will now derive the recursions for $h_i$ and $r_i$ using the recursions for $a_i$, $b_i$ and $c_i$ and a slightly more general form of recursions for $d_i$ and $e_i$ than those used in (2.9). The recursions for $d_i$ and $e_i$ are as follows:

$$d_{i+1} = k_{i+1}\,(p_{i+1} e_i - q_{i+1} d_i)$$

and

$$e_{i+1} = k_{i+1}\, d_i.$$

Now,

$$h_{i+1} = 2a_{i+1} d_{i+1} + b_{i+1} e_{i+1}$$

$$= 2\,(c_i/p_{i+1})\,k_{i+1}\,(p_{i+1} e_i - q_{i+1} d_i) + (b_i + 2c_i q_{i+1}/p_{i+1})\,k_{i+1}\, d_i$$

$$= 2k_{i+1} c_i e_i + k_{i+1} b_i d_i$$

$$= k_{i+1}\, r_i.$$

In a similar way, we can obtain the recursion for $r_i$. As a result, the set of recursions that we will use is as follows:

$$\begin{aligned}
h_{i+1} &= k_{i+1}\, r_i \\
r_{i+1} &= k_{i+1}\,(p_{i+1} h_i + q_{i+1} r_i) \\
d_{i+1} &= k_{i+1}\,(p_{i+1} e_i - q_{i+1} d_i) \\
e_{i+1} &= k_{i+1}\, d_i
\end{aligned} \tag{3.3}$$
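The recursions (3.3) can be checked against the definitions $h_i = 2a_i d_i + b_i e_i$ and $r_i = 2c_i e_i + b_i d_i$ in exact arithmetic. The sketch below also checks the identity $r_{i+1} e_{i+1} + h_{i+1} d_{i+1} = p_{i+1} k_{i+1}^2 (r_i e_i + h_i d_i)$, which follows directly from (3.3). The function name `check`, the starting values and the use of a single constant scale factor `k` for every step are assumptions made for illustration.

```python
from fractions import Fraction

def check(pairs, a=Fraction(1), b=Fraction(0), c=Fraction(1),
          d=Fraction(1), e=Fraction(2), k=Fraction(1, 2)):
    """Return True if (3.3) reproduces h and r as defined from a, b, c, d, e."""
    h, r = 2 * a * d + b * e, 2 * c * e + b * d
    for p, q in pairs:
        inv = r * e + h * d
        a, b, c = c / p, b + 2 * c * q / p, a * p + b * q + c * q * q / p  # (2.3)
        h, r = k * r, k * (p * h + q * r)                                  # (3.3)
        d, e = k * (p * e - q * d), k * d                                  # (3.3)
        if h != 2 * a * d + b * e or r != 2 * c * e + b * d:
            return False
        if r * e + h * d != p * k * k * inv:
            return False
    return True
```

The check passes for any sequence of positive $(p, q)$ pairs, since both identities are algebraic consequences of the recursions.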


The condition for the selection of a $(p, q)$ pair is given by: $y_i(x) \in I_{pq}$. In other words, the selection condition is: If

$$m_{pq} \le \frac{j\, r_i\, u + \sqrt{-\Delta}\; d_i}{\sqrt{-\Delta}\; e_i - j\, h_i\, u} \le M_{pq}$$

then choose $(p, q)$. Note that we cannot use this condition directly since $u$ is an unknown; therefore, we would like to rewrite the selection condition as follows:

$$\arctan(\mathrm{ARG}_i(m_{pq})) \le \frac{\sqrt{-\Delta}}{2}\, j\, x \le \arctan(\mathrm{ARG}_i(M_{pq})) \tag{3.4}$$

where

$$\mathrm{ARG}_i(s) = \frac{\sqrt{-\Delta}\; e_i\, s - \sqrt{-\Delta}\; d_i}{r_i + s\, h_i}.$$

Note that such a rewriting is valid if both of the following conditions are satisfied: (1) $\mathrm{ARG}_i(s)$ is a monotone-increasing function of $s$, and (2) $\arctan(z)$ is a monotone-increasing function of $z$. Since condition (2) is already known to be satisfied, we only have to verify condition (1). To do this, note that
$$\frac{d\,\mathrm{ARG}_i(s)}{ds} = \frac{(r_i + h_i s)(\sqrt{-\Delta}\; e_i) - h_i \sqrt{-\Delta}\,(e_i s - d_i)}{(r_i + h_i s)^2} = \frac{\sqrt{-\Delta}\,(r_i e_i + h_i d_i)}{(r_i + h_i s)^2}.$$

Now

$$r_{i+1} e_{i+1} + h_{i+1} d_{i+1} = k_{i+1}(p_{i+1} h_i + q_{i+1} r_i)\, k_{i+1} d_i + k_{i+1} r_i\, k_{i+1}(p_{i+1} e_i - q_{i+1} d_i)$$

$$= k_{i+1}^2\,(p_{i+1} h_i d_i + p_{i+1} r_i e_i)$$

$$= p_{i+1}\, k_{i+1}^2\,(r_i e_i + h_i d_i).$$

Therefore,

$$r_i e_i + h_i d_i = \left(\prod_{n=1}^{i} p_n k_n^2\right)(r_0 e_0 + h_0 d_0).$$

Therefore, $\mathrm{ARG}_i(s)$ is a monotone-increasing function of $s$ provided $r_0 e_0 + h_0 d_0 > 0$. Observe that there is no loss of generality in assuming that $r_0 e_0 + h_0 d_0 > 0$, since if $r_0 e_0 + h_0 d_0 < 0$ then $\mathrm{ARG}_i(s)$ will be a monotone-decreasing function of $s$ and we can turn the inequality (3.4) around and follow very similar arguments. Also note that the condition $r_0 e_0 + h_0 d_0 = 0$ will not occur, since this implies that either $t_0$ (the initial condition) is complex or $d_0 = e_0 = 0$ or $a_0 = 0$.

In theory, the selection condition (3.4) can be used to select the $(p, q)$ pair during each iterative step, but the amount of computation involved is clearly excessive. We note that in order to compute a boundary of a selection region, $\arctan(\mathrm{ARG}_i(s))$ needs to be computed and there are as many as $|S_p| \times |S_q|$ selection regions. It is, therefore, clear that we would like to use an approximation to $\arctan(\mathrm{ARG}_i(s))$ which is "easy" enough to compute from the available coefficients $h_i$, $r_i$, $d_i$, $e_i$ and the known value of $s$. We note that the use of an approximation in the selection procedure implies the use of redundancy in the digit sets since otherwise we cannot guarantee correct selection.

With the use of redundancy, there will be regions in which more than one $(p, q)$ pair can be chosen. Define $I_{p_1 q_1} < I_{p_2 q_2}$ if there exists $f \in I_{p_1 q_1}$ such that for all $g \in I_{p_2 q_2}$, $f < g$. A pair $(p_2, q_2)$ is said to be right-adjacent to a pair $(p_1, q_1)$ if $I_{p_1 q_1} < I_{p_2 q_2}$ and for all other $(p_3, q_3)$ such that $I_{p_1 q_1} < I_{p_3 q_3}$, $I_{p_2 q_2} < I_{p_3 q_3}$. A similar definition of left-adjacency can be given. Given a pair $(p_1, q_1)$, its left-adjacent pair $(p_3, q_3)$ and its right-adjacent pair $(p_2, q_2)$, the following holds: If $f \in I_{p_1 q_1} \cap I_{p_3 q_3}$ then we can choose $(p_3, q_3)$ or $(p_1, q_1)$; if $f \in I_{p_1 q_1} \cap I_{p_2 q_2}$ then we can choose $(p_1, q_1)$ or $(p_2, q_2)$; and if $f \in I_{p_1 q_1} - (I_{p_1 q_1} \cap I_{p_3 q_3}) - (I_{p_1 q_1} \cap I_{p_2 q_2})$ then we must choose the pair $(p_1, q_1)$. We note that the existence of selection overlap regions such as $I_{p_1 q_1} \cap I_{p_3 q_3}$ allows us to use an approximation in the selection procedure. Let us denote the approximate value of $\arctan(\mathrm{ARG}_i(s))$ by $\mathrm{AT}_i(s)$; then the selection rule to be used can be specified by:

$$\text{If } \mathrm{AT}_i(z_1) \le \frac{\sqrt{-\Delta}}{2}\, j\, x < \mathrm{AT}_i(z_2) \text{ then choose } (p_1, q_1) \tag{3.5}$$

where $z_1 \in I_{p_1 q_1} \cap I_{p_3 q_3}$ and $z_2 \in I_{p_1 q_1} \cap I_{p_2 q_2}$. Note that $z_1$, $z_2$ will now be boundaries between adjacent selection regions and therefore the selection of the $(p, q)$ pair will now be unique. In order to guarantee correct selection using condition (3.5), we have to show that the region specified by condition (3.5) is a subset of the region specified by the condition (3.4). From this, we can say that the maximum error allowable in the computation of $\arctan(\mathrm{ARG}_i(s))$, denoted by $E_i$, is given by:

$$E_i = \max\left[\arctan(\mathrm{ARG}_i(M_{p_1 q_1})) - \mathrm{AT}_i(z_2),\ \ \mathrm{AT}_i(z_1) - \arctan(\mathrm{ARG}_i(m_{p_1 q_1}))\right].$$

In other words, we can find $s_1$ and $s_2$ ($s_2 > s_1$) such that

$$E_i \le \arctan(\mathrm{ARG}_i(s_2)) - \arctan(\mathrm{ARG}_i(s_1)).$$

Now we note that $\arctan(z)$ satisfies the Lipschitz condition, i.e.,

$$|\arctan(z_2) - \arctan(z_1)| \le L\,|z_2 - z_1|$$

for $L > 0$ and $L \le 1$. Therefore,

$$E_i \le L\,(\mathrm{ARG}_i(s_2) - \mathrm{ARG}_i(s_1)). \tag{3.6}$$


Now,

$$H_i = \mathrm{ARG}_i(s_2) - \mathrm{ARG}_i(s_1) = \frac{\sqrt{-\Delta}\,(e_i s_2 - d_i)}{r_i + h_i s_2} - \frac{\sqrt{-\Delta}\,(e_i s_1 - d_i)}{r_i + h_i s_1} = \frac{(r_i e_i + h_i d_i)\,\sqrt{-\Delta}\,(s_2 - s_1)}{(s_1 h_i + r_i)(s_2 h_i + r_i)}.$$

Using the expression derived for $r_i e_i + h_i d_i$ earlier, we have

$$H_i = \frac{\sqrt{-\Delta}\,(s_2 - s_1)\left(\prod_{n=1}^{i} p_n k_n^2\right)(r_0 e_0 + h_0 d_0)}{(s_1 h_i + r_i)(s_2 h_i + r_i)}. \tag{3.7}$$

We are now interested in eliminating $h_i$ and $r_i$ from the expression of $H_i$. Towards this end, we will show that

$$r_i = r_0 K_i Q_i + h_0 K_i P_i$$

where

$$K_i = \prod_{n=1}^{i} k_n.$$

We proceed to prove this result by induction on $i$. Since $P_0 = 0$, $Q_0 = 1$ and $K_0 = 1$, we have $r_0 = r_0 \cdot 1 \cdot 1 + h_0 \cdot 1 \cdot 0 = r_0$. Now from recursions (3.3), we have

$$r_1 = k_1\,(p_1 h_0 + q_1 r_0) = r_0 K_1 Q_1 + h_0 K_1 P_1.$$

Now assume that the required result is true for $r_n$ for $n \le i$. Again from recursions (3.3),

$$r_{i+1} = k_{i+1}\,(p_{i+1} h_i + q_{i+1} r_i)$$

$$= k_{i+1}\,(p_{i+1} k_i r_{i-1} + q_{i+1} r_i)$$

$$= k_{i+1}\,\big(p_{i+1} k_i\,(r_0 K_{i-1} Q_{i-1} + h_0 K_{i-1} P_{i-1}) + q_{i+1}\,(r_0 K_i Q_i + h_0 K_i P_i)\big)$$

$$= r_0 K_{i+1}\,(p_{i+1} Q_{i-1} + q_{i+1} Q_i) + h_0 K_{i+1}\,(p_{i+1} P_{i-1} + q_{i+1} P_i)$$

$$= r_0 K_{i+1} Q_{i+1} + h_0 K_{i+1} P_{i+1}.$$
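The closed form $r_i = K_i\,(r_0 Q_i + h_0 P_i)$ obtained by the induction can be confirmed by running the $h$, $r$ recursions of (3.3) next to the convergent recursions (2.5). This is a sketch for verification only; the function name, the starting values $r_0 = 3$, $h_0 = 2$ and the sample scale factors are assumptions.

```python
from fractions import Fraction

def closed_form_holds(pairs, ks, r0=Fraction(3), h0=Fraction(2)):
    """Compare the recursively computed r_i with K_i (r0 Q_i + h0 P_i) at every step."""
    P_prev, Q_prev, P, Q = 1, 0, 0, 1        # P_{-1}, Q_{-1}, P_0, Q_0
    r, h, K = r0, h0, Fraction(1)
    for (p, q), k in zip(pairs, ks):
        h, r = k * r, k * (p * h + q * r)    # recursions (3.3)
        P, P_prev = q * P + p * P_prev, P    # recursions (2.5)
        Q, Q_prev = q * Q + p * Q_prev, Q
        K *= k                               # K_i = k_1 k_2 ... k_i
        if r != K * (r0 * Q + h0 * P):
            return False
    return True
```

For the first step with $(p_1, q_1) = (2, 1)$ and $k_1 = 1/2$, both sides equal $7/2$.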
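Thus, we have the required result. It follows from this that

$$h_i = k_i\, r_{i-1} = K_i\,(r_0 Q_{i-1} + h_0 P_{i-1}).$$

Now substituting these expressions for $h_i$ and $r_i$ in the equation (3.7), and noting that $\prod_{n=1}^{i} k_n^2 = K_i^2$ cancels, we have

$$H_i = \frac{\left(\prod_{n=1}^{i} p_n\right)(r_0 e_0 + h_0 d_0)\,\sqrt{-\Delta}\,(s_2 - s_1)}{\left[s_1 (r_0 Q_{i-1} + h_0 P_{i-1}) + r_0 Q_i + h_0 P_i\right]\left[s_2 (r_0 Q_{i-1} + h_0 P_{i-1}) + r_0 Q_i + h_0 P_i\right]}.$$

Substituting this in the expression (3.6), we have

$$E_i \le \frac{L \left(\prod_{n=1}^{i} p_n\right)(r_0 e_0 + h_0 d_0)\,\sqrt{-\Delta}\,(s_2 - s_1)}{\left[s_1 (r_0 Q_{i-1} + h_0 P_{i-1}) + r_0 Q_i + h_0 P_i\right]\left[s_2 (r_0 Q_{i-1} + h_0 P_{i-1}) + r_0 Q_i + h_0 P_i\right]}.$$

Now we consider two cases, depending upon the value of $r_0$. If $r_0 \ne 0$ then we have

$$E_i \le B_1\, \frac{\prod_{n=1}^{i} p_n}{Q_i Q_{i-1}} \tag{3.8}$$

since $P_i$, $Q_i$, $P_{i-1}$, $Q_{i-1}$, $s_1$, $s_2$ are all $\ge 0$ and where

$$B_1 = L \left(\frac{r_0 e_0 + h_0 d_0}{r_0^2}\right)\left(\frac{s_2 - s_1}{s_2}\right)\sqrt{-\Delta}.$$

On the other hand, if $r_0 = 0$,

$$E_i \le \frac{\left(\prod_{n=1}^{i} p_n\right) L\, h_0 d_0\,\sqrt{-\Delta}\,(s_2 - s_1)}{h_0^2\,(s_1 P_{i-1} + P_i)(s_2 P_{i-1} + P_i)} \le \frac{L\, d_0\,\sqrt{-\Delta}\,(s_2 - s_1)}{s_2\, h_0} \cdot \frac{\prod_{n=1}^{i} p_n}{P_i P_{i-1}}. \tag{3.9}$$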

We will now obtain a bound on $P_i P_{i-1}$ in terms of $Q_i Q_{i-1}$. A well known property of the convergents of an infinite continued fraction, $f$, can be written as [6]:

$$\frac{P_0}{Q_0} < \frac{P_2}{Q_2} < \cdots < f < \cdots < \frac{P_3}{Q_3} < \frac{P_1}{Q_1}.$$

Therefore, if $i$ is odd, $P_i/Q_i \ge m$. If $i \ge 2$ is even,

$$\frac{P_i}{Q_i} \ge \frac{P_2}{Q_2} \ge \frac{p_{\min}}{q_{\max} + p_{\max}/q_{\min}}.$$

Therefore,

$$\frac{P_i P_{i-1}}{Q_i Q_{i-1}} \ge \frac{m\, p_{\min}}{q_{\max} + p_{\max}/q_{\min}}.$$

Substituting  this  in   (3.9)j    we  have, 


2      1        r 

where  B~   =  

2  s^ 


(  np.) 

,i=i  J 


0 


h 
0 


m  p   . 
mm 


TTli 


ax 


max 
Tiiiri 


(3.10) 


From   (3.9)   said    (3.10),    we  have, 


E.    < 


3=1  3 


±-      QiQi-l 


where  B  =  B  if  r  /  0  and  B  otherwise.   Note  that  B  is  a  fixed,  finite 
and  bounded  constant  independent  of  the  value  of  i.   The  factor 


(  K  p.)/Q.  Q.  -.    can  be  interpreted  as  the  error  in  the  solution,  since 


,1=1 
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it  equals  the  difference  in  values  of  the  successive  convergent s  P.  ,/Q.  , 
and  P./Q.  [6].   Therefore,  if  we  demand  linear  convergence  then  we  must  have, 

Tip 

1=1  d  -i 

—■, r —  =  constant  •  a 

Q.Q.  , 
i  l-l 

for  a  small  positive  constant  and  some  a  >  1.  As  a  result,  we  have, 

E.  <  B*  •  en-1  . 

l  — 

But this implies that the computation of $\arctan(\mathrm{ARG}_i(s))$ must be
carried out to nearly the same precision as that desired of the function
being evaluated. Thus we conclude that we cannot obtain a computationally
simple selection procedure for the functions that can be evaluated using
the Riccati equation with constant coefficients and $\Delta < 0$.
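
The interpretation of $\left(\prod p_j\right)/Q_i Q_{i-1}$ as the gap between
successive convergents can be checked numerically. The Python sketch below
assumes a continued fraction of the form $q_0 + p_1/(q_1 + p_2/(q_2 + \dots))$
with the usual recurrences $P_i = q_i P_{i-1} + p_i P_{i-2}$ and
$Q_i = q_i Q_{i-1} + p_i Q_{i-2}$; the function names and the sample
coefficients (powers of two) are illustrative and are not taken from the paper.

```python
from fractions import Fraction
from math import prod


def convergents(p, q):
    """Return (P, Q) with P[i], Q[i] the i-th convergent numerator and
    denominator of q[0] + p[1]/(q[1] + p[2]/(q[2] + ...)); p[0] is unused."""
    P = [Fraction(1), Fraction(q[0])]  # P_{-1}, P_0
    Q = [Fraction(0), Fraction(1)]     # Q_{-1}, Q_0
    for i in range(1, len(q)):
        P.append(q[i] * P[-1] + p[i] * P[-2])
        Q.append(q[i] * Q[-1] + p[i] * Q[-2])
    return P[1:], Q[1:]  # drop the index -1 entries


def successive_gap(p, q, i):
    """|P_i/Q_i - P_{i-1}/Q_{i-1}|, computed exactly."""
    P, Q = convergents(p, q)
    return abs(P[i] / Q[i] - P[i - 1] / Q[i - 1])


def product_term(p, q, i):
    """(p_1 * ... * p_i) / (Q_i * Q_{i-1}), computed exactly."""
    P, Q = convergents(p, q)
    return Fraction(prod(p[1:i + 1])) / (Q[i] * Q[i - 1])


if __name__ == "__main__":
    # Hypothetical coefficients, all integral powers of two.
    p = [None, 1, 2, Fraction(1, 2), 4]
    q = [1, 2, 1, 4, Fraction(1, 2)]
    for i in range(1, len(q)):
        print(i, successive_gap(p, q, i), product_term(p, q, i))
```

With positive coefficients the two columns agree for every $i$, so demanding
that this factor shrink geometrically is the same as demanding linear
convergence of the convergents.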


3.1.2  The Case With $\Delta > 0$

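Consider the following Riccati equation:

$$y_i' = j\,(a_i y_i^2 + b_i y_i + c_i)$$

such that $\Delta_i = \Delta > 0$ and $j = 1$ if $i$ is even and $-1$
otherwise. The solution to this equation can be written as,

$$y_i(x) = -\frac{\sqrt{\Delta}}{2a_i}\coth\left(\frac{j\sqrt{\Delta}\,x}{2} + A_i\right) - \frac{b_i}{2a_i} \tag{3.11}$$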

where $A_i$ is an arbitrary constant of integration. Using the initial
condition $y_i(0) = t_i = d_i/e_i$, we obtain

$$\tanh A_i = -\frac{\sqrt{\Delta}\,e_i}{2a_i d_i + b_i e_i}.$$

For the sake of brevity, we let $h_i = 2a_i d_i + b_i e_i$ and after
substituting for $A_i$ in (3.11), we get,

$$y_i(x) = \frac{j\,\Delta\,e_i\tanh\left(\frac{\sqrt{\Delta}\,x}{2}\right) - j\,b_i h_i\tanh\left(\frac{\sqrt{\Delta}\,x}{2}\right) - 2\sqrt{\Delta}\,a_i d_i}{2a_i\left(j\,h_i\tanh\left(\frac{\sqrt{\Delta}\,x}{2}\right) - \sqrt{\Delta}\,e_i\right)}.$$


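From which, we get,

$$j\tanh\left(\frac{\sqrt{\Delta}\,x}{2}\right) = \frac{\sqrt{\Delta}\,(y_i e_i - d_i)}{y_i h_i + r_i} \tag{3.12}$$

where $r_i = b_i d_i + 2c_i e_i$. From equation (3.11), we note that if
$e_0 = 1$, $d_0 = 0$, $h_0 = 0$ and $r_0 = \sqrt{\Delta}$ then
$y_0(x) = \tanh\left(\frac{\sqrt{\Delta}\,x}{2}\right)$. If $e_0 = 0$,
$d_0 = 1$, $h_0 = -\sqrt{\Delta}$ and $r_0 = 0$ then
$y_0(x) = \coth\left(\frac{\sqrt{\Delta}\,x}{2}\right)$. If $c_0 = 0$ and
$a_0 = 0$ then we have $y_0(x) = A_0 e^{b_0 x}$.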

From the form of the equation (3.12) and the definitions of $r_i$ and
$h_i$, it is clear that we can follow the same arguments as in Section 3.1.1
and prove that a computationally simple selection procedure cannot be
obtained in the case that $\Delta > 0$ or $a = 0$. Thus we have shown the
negative results for the Riccati equation with constant coefficients.
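
As a sanity check on the closed form for $\Delta > 0$, relation (3.12) can be
solved for $y_i$, giving
$y_i = (j\,r_i T + \sqrt{\Delta}\,d_i)/(\sqrt{\Delta}\,e_i - j\,h_i T)$ with
$T = \tanh(\sqrt{\Delta}\,x/2)$. The Python sketch below evaluates this
rearrangement and measures, by a central finite difference, how well it
satisfies $y' = j(a y^2 + b y + c)$. It assumes $\Delta = b^2 - 4ac$; the
function names, step size and sample coefficients are illustrative choices.

```python
import math


def riccati_solution(x, a, b, c, d, e, j=1):
    """Solution of y' = j(a y^2 + b y + c), y(0) = d/e, for Delta > 0,
    obtained by solving relation (3.12) for y."""
    delta = b * b - 4 * a * c          # assumed definition of Delta
    root = math.sqrt(delta)
    h = 2 * a * d + b * e              # h_i in the text
    r = b * d + 2 * c * e              # r_i in the text
    t = math.tanh(root * x / 2)
    return (j * r * t + root * d) / (root * e - j * h * t)


def ode_residual(x, a, b, c, d, e, j=1, step=1e-6):
    """y'(x) - j(a y^2 + b y + c), with y' from a central difference."""
    y = riccati_solution(x, a, b, c, d, e, j)
    dy = (riccati_solution(x + step, a, b, c, d, e, j)
          - riccati_solution(x - step, a, b, c, d, e, j)) / (2 * step)
    return dy - j * (a * y * y + b * y + c)


if __name__ == "__main__":
    # Hypothetical coefficients with Delta = 9 - 4 = 5 > 0 and y(0) = 1/2.
    print(ode_residual(0.1, 1, 3, 1, 1, 2))
    # Special case e_0 = 1, d_0 = 0, h_0 = 0, r_0 = sqrt(Delta): y = tanh.
    print(riccati_solution(0.3, -1, 0, 1, 0, 1), math.tanh(0.3))
```

The residual should be at the level of finite-difference noise, and the
special case reproduces $\tanh(\sqrt{\Delta}\,x/2)$ as stated in the text.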

3.2   Variable  Coefficients 

We will only consider the case with $\Delta = 0$, i.e., we consider the
subset  L   of  L.   Consider  the  equation 

$$y_i' = j\,k(x)\,(a_i y_i + b_i)^2 \tag{3.13}$$

where $j = 1$ if $i$ is even and $-1$ otherwise. We will use the following
set of recursions:
$$\left.\begin{aligned}
a_{i+1} &= b_i/\sqrt{p_{i+1}} \\
b_{i+1} &= a_i\sqrt{p_{i+1}} + b_i q_{i+1}/\sqrt{p_{i+1}} \\
d_{i+1} &= e_i\sqrt{p_{i+1}} - d_i q_{i+1}/\sqrt{p_{i+1}} \\
e_{i+1} &= d_i/\sqrt{p_{i+1}}
\end{aligned}\right\} \tag{3.14}$$


The solution to this equation is given by:

$$y_i(x) = \frac{d_i + j\,(g(x)-g(0))\,b_i\,(a_i d_i + b_i e_i)}{e_i - j\,(g(x)-g(0))\,a_i\,(a_i d_i + b_i e_i)} \tag{3.15}$$

where $g(x) = \int k(x)\,dx$. To simplify the equation (3.15), we can easily
prove by induction on $i$, that

$$a_i d_i + b_i e_i = a_0 d_0 + b_0 e_0 = r_0.$$

Note that (using the recursions (3.14)),

$$\begin{aligned}
a_{i+1} d_{i+1} + b_{i+1} e_{i+1}
&= \frac{b_i}{\sqrt{p_{i+1}}}\left(e_i\sqrt{p_{i+1}} - \frac{d_i q_{i+1}}{\sqrt{p_{i+1}}}\right)
 + \left(a_i\sqrt{p_{i+1}} + \frac{b_i q_{i+1}}{\sqrt{p_{i+1}}}\right)\frac{d_i}{\sqrt{p_{i+1}}} \\
&= a_i d_i + b_i e_i.
\end{aligned}$$
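
The induction step can also be checked numerically. The Python sketch below
applies the four recursions to arbitrary starting values and confirms that
the combination $a_i d_i + b_i e_i$, the quantity left at the end of the
expansion above, is unchanged from one step to the next. The function names
and the sample values of $(p, q)$ are illustrative assumptions.

```python
import math


def recursion_step(a, b, d, e, p, q):
    """One step (a_i, b_i, d_i, e_i) -> (a_{i+1}, b_{i+1}, d_{i+1}, e_{i+1})
    for the chosen pair (p, q) = (p_{i+1}, q_{i+1})."""
    s = math.sqrt(p)
    return (b / s,
            a * s + b * q / s,
            e * s - d * q / s,
            d / s)


def invariant(a, b, d, e):
    """The combination a*d + b*e preserved by recursion_step."""
    return a * d + b * e


if __name__ == "__main__":
    state = (1.0, 2.0, 3.0, 0.5)            # hypothetical a_0, b_0, d_0, e_0
    print(invariant(*state))
    for p, q in [(2.0, 1.0), (0.5, 4.0), (4.0, 0.25)]:
        state = recursion_step(*state, p, q)
        print(invariant(*state))            # same value up to rounding
```

Because the $q_{i+1}$ terms cancel exactly, the invariant does not depend on
which pair $(p, q)$ the selection procedure picks.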


Using this, we get

$$y_i(x) = \frac{d_i + j\,(g(x)-g(0))\,b_i r_0}{e_i - j\,(g(x)-g(0))\,a_i r_0}.$$
The selection condition can now be written as: If

$$m_{pq} \le \frac{d_i + j\,(g(x)-g(0))\,b_i r_0}{e_i - j\,(g(x)-g(0))\,a_i r_0} \le M_{pq}$$

then choose $(p,q)$.


Since $g(x)$ is the unknown we want to transform the selection condition to:

$$g^{-1}\left(\frac{j\,(M_{pq} e_i - d_i + j M_{pq}\,g(0)\,a_i r_0)}{b_i r_0 + M_{pq} a_i r_0}\right) \ge x \ge g^{-1}\left(\frac{j\,(m_{pq} e_i - d_i + j m_{pq}\,g(0)\,a_i r_0)}{r_0\,(b_i + m_{pq} a_i)}\right)$$


But this transformation is valid provided $\mathrm{ARG}_i(s)$ is a
monotone-increasing function of $s$ and $g^{-1}(z)$ is a monotone-increasing
function of $z$. Note that,

$$\mathrm{ARG}_i(s) = \frac{s e_i - d_i + j\,s\,g(0)\,a_i r_0}{r_0\,(b_i + s a_i)}.$$

Therefore,

$$\frac{d\,\mathrm{ARG}_i(s)}{ds} = \frac{r_0 (b_i + s a_i)(e_i + j\,g(0)\,a_i r_0) - (s e_i - d_i + j\,s\,g(0)\,a_i r_0)\,r_0 a_i}{r_0^2\,(b_i + s a_i)^2} = \frac{1 + j\,g(0)\,a_i b_i}{(b_i + s a_i)^2}.$$

For simplicity, we assume $g(0) = 0$; then clearly, $\mathrm{ARG}_i(s)$ is a
monotone-increasing function of $s$. We also assume that $g^{-1}(z)$ is a
monotone-increasing function of $z$. If it is monotone-decreasing then
we can turn the inequality (3.15) around and similar arguments can be
carried out.
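
The closed form of the derivative can be compared against a finite
difference. The Python sketch below takes $\mathrm{ARG}_i(s)$ exactly as
written in the text and assumes that $r_0$ equals the invariant
$a_i d_i + b_i e_i$ of the recursions; the function names and sample values
are illustrative.

```python
def arg(s, a, b, d, e, g0=0.0, j=1):
    """ARG_i(s) as written in the text, with r_0 = a*d + b*e (assumed)."""
    r0 = a * d + b * e
    return (s * e - d + j * s * g0 * a * r0) / (r0 * (b + s * a))


def arg_slope(s, a, b, g0=0.0, j=1):
    """Closed form of d ARG_i / ds: (1 + j g(0) a b) / (b + s a)^2."""
    return (1 + j * g0 * a * b) / (b + s * a) ** 2


def numeric_slope(s, a, b, d, e, g0=0.0, j=1, step=1e-6):
    """Central-difference estimate of d ARG_i / ds."""
    return (arg(s + step, a, b, d, e, g0, j)
            - arg(s - step, a, b, d, e, g0, j)) / (2 * step)


if __name__ == "__main__":
    a, b, d, e = 1.0, 2.0, 3.0, 0.5   # hypothetical a_i, b_i, d_i, e_i
    for g0 in (0.0, 0.3):
        for j in (1, -1):
            print(g0, j, arg_slope(0.7, a, b, g0, j),
                  numeric_slope(0.7, a, b, d, e, g0, j))
```

With $g(0) = 0$ the slope reduces to $1/(b_i + s a_i)^2$, which is positive
wherever it is defined; this is the monotonicity used in the argument.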

The inequality (3.15) can be split up into two parts depending
upon the value of $i$. We will only consider the case when $i$ is even, the
other case being very similar. Then the selection condition is:

$$g^{-1}(\mathrm{ARG}_i(m_{pq})) \le x \le g^{-1}(\mathrm{ARG}_i(M_{pq})).$$

Since $g^{-1}(\mathrm{ARG}_i(s))$ is difficult to compute in general, we
would like to use an approximation. The maximum error allowable in such
an approximation can be written as,

$$E_i = g^{-1}(\mathrm{ARG}_i(s_2)) - g^{-1}(\mathrm{ARG}_i(s_1))$$

where $m_{pq} < s_1 < s_2 < M_{pq}$. Now we assume that $g^{-1}$ satisfies
the Lipschitz condition with "small" value of the Lipschitz constant $L$.
Then

$$E_i \le L\,[\mathrm{ARG}_i(s_2) - \mathrm{ARG}_i(s_1)] \tag{3.16}$$

Now,

$$H_i = \mathrm{ARG}_i(s_2) - \mathrm{ARG}_i(s_1) = \frac{s_2 e_i - d_i}{r_0\,(b_i + s_2 a_i)} - \frac{s_1 e_i - d_i}{r_0\,(b_i + s_1 a_i)} = \frac{s_2 - s_1}{(b_i + s_2 a_i)(b_i + s_1 a_i)}.$$
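
The last equality for $H_i$ can be filled in as follows. This is a worked
step under the assumptions $g(0) = 0$ and $r_0 = a_i d_i + b_i e_i$ (the
invariant of the recursions); it is a sketch of the intermediate algebra
rather than a quotation from the original derivation.

```latex
\begin{aligned}
H_i &= \frac{(s_2 e_i - d_i)(b_i + s_1 a_i) - (s_1 e_i - d_i)(b_i + s_2 a_i)}
            {r_0\,(b_i + s_2 a_i)(b_i + s_1 a_i)} \\
    &= \frac{(s_2 - s_1)\,b_i e_i + (s_2 - s_1)\,a_i d_i}
            {r_0\,(b_i + s_2 a_i)(b_i + s_1 a_i)} \\
    &= \frac{(s_2 - s_1)\,(a_i d_i + b_i e_i)}
            {r_0\,(b_i + s_2 a_i)(b_i + s_1 a_i)}
     = \frac{s_2 - s_1}{(b_i + s_2 a_i)(b_i + s_1 a_i)} .
\end{aligned}
```

The cross terms $s_1 s_2\,a_i e_i$ and $b_i d_i$ cancel in the numerator,
which is why $r_0$ drops out of the final expression.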

From this point onwards, we can follow a procedure similar to
Section 3.1 to obtain a similar negative result.

4.   Conclusion

Recently,  there  has  been  some  interest  in  the  use  of  continued 
fractions  for  digital  hardware  calculations.   We  require  that  the 
coefficients  of  the  continued  fractions  be  integral  powers  of  two.  As 
a  result,  the  selection  of  coefficients  during  the  iterative  evaluation  of 
a function becomes a difficult problem. We have shown that practical
selection procedures do not exist for most functions evaluated using the
Riccati equation approach.